LLM extension: add ethosu 8w16a and quantize scope plumbing #19876

Open
xingguo01 wants to merge 1 commit into pytorch:main from xingguo01:llm-ethosu-8w16a-quantize-scope
xingguo01 (Collaborator) commented May 29, 2026

  • adds the `ethosu_8w16a` PT2E quantization mode
  • introduces shared `quantization.quantize_scope` handling for Arm backends
  • wires the Arm quantize scope through the LLM export path
  • passes Ethos-U system config and memory mode through the partitioner setup

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani

Signed-off-by: Xingguo Li <xingguo.li@arm.com>
Change-Id: I3f8446a20fb63670520a6b35484669d8df6f31bf
@xingguo01 xingguo01 requested a review from larryliu0820 as a code owner May 29, 2026 16:09
Copilot AI review requested due to automatic review settings May 29, 2026 16:09
@xingguo01 xingguo01 requested a review from mergennachin as a code owner May 29, 2026 16:09
pytorch-bot Bot commented May 29, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19876

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Cancelled Job, 4 Unrelated Failures

As of commit e079d39 with merge base 88faab2:

NEW FAILURE - The following job has failed:

CANCELLED JOB - The following job was cancelled. Please retry:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla Bot added the CLA Signed label May 29, 2026
xingguo01 added the partner: arm, ciflow/trunk, release notes: arm, and help wanted labels May 29, 2026
Copilot AI (Contributor) left a comment

Pull request overview

This PR extends the LLM export PT2E quantization plumbing for Arm backends by adding a new Ethos-U quantization mode and introducing shared “quantize scope” handling across TOSA/Ethos-U/VGF, along with passing Ethos-U compiler flags through the export/partitioner setup.

Changes:

  • Added PT2E quantization mode `ethosu_16a8w` and wired it through the LLM export path.
  • Introduced shared Arm quantization-scope application logic (`full` vs `linear`) for TOSA/Ethos-U/VGF quantizers.
  • Plumbed Ethos-U `extra_flags` through the partitioner and quantizer setup.
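
To picture the full versus linear scope mentioned in the list above, here is a toy sketch. It is an illustration only and not the code in this PR: the `nodes_to_quantize` helper, the `(name, kind)` node tuples, and the enum member names are hypothetical. Only the `QuantizeScope` name and the `full`/`linear` values are taken from the review summary.

```python
from enum import Enum
from typing import List, Tuple


class QuantizeScope(str, Enum):
    """Toy stand-in for the shared scope setting (member names are assumed)."""
    FULL = "full"
    LINEAR = "linear"


def nodes_to_quantize(nodes: List[Tuple[str, str]], scope: QuantizeScope) -> List[str]:
    """Return the names of the nodes that would be annotated under `scope`.

    `nodes` is a hypothetical list of (name, kind) pairs standing in for a graph.
    """
    if scope is QuantizeScope.FULL:
        # Full scope: every node is a quantization candidate.
        return [name for name, _ in nodes]
    if scope is QuantizeScope.LINEAR:
        # Linear scope: only linear layers are quantized, the rest stay in float.
        return [name for name, kind in nodes if kind == "linear"]
    raise ValueError(f"unsupported scope: {scope!r}")


toy_graph = [("q_proj", "linear"), ("softmax", "softmax"), ("out_proj", "linear")]
full_names = nodes_to_quantize(toy_graph, QuantizeScope("full"))
linear_names = nodes_to_quantize(toy_graph, QuantizeScope("linear"))
```

In this sketch the scope is parsed from its string value, which is how a config field such as `quantization.quantize_scope` would typically be consumed; how the real quantizers apply the scope may differ.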

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `extension/llm/export/quantizer_lib.py` | Adds `ethosu_16a8w`, adds Arm quantize-scope helper, and plumbs scope/flags into Arm quantizers. |
| `extension/llm/export/partitioner_lib.py` | Passes Ethos-U `extra_flags` through to `EthosUCompileSpec`. |
| `extension/llm/export/config/llm_config.py` | Adds `ethosu_16a8w`, introduces shared `QuantizeScope`, and adds Ethos-U `extra_flags` config. |
| `examples/models/llama/export_llama_lib.py` | Exposes `ethosu_16a8w` via CLI and wires Arm quantize-scope + Ethos-U flags into quantizer/partitioner creation. |
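
Mode names such as `ethosu_16a8w` appear to follow a `<backend>_<N>a<M>w` pattern, which is commonly read as N-bit activations and M-bit weights. The sketch below assumes that reading and is not part of the PR; `parse_pt2e_mode` is a hypothetical helper written only to make the naming convention concrete.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: backend, optional "<bits>a" for activations, then "<bits>w" for weights.
_MODE_RE = re.compile(r"^(?P<backend>[a-z]+)_(?:(?P<act>\d+)a)?(?P<wt>\d+)w$")


def parse_pt2e_mode(mode: str) -> Tuple[str, Optional[int], int]:
    """Split a mode string into (backend, activation_bits, weight_bits).

    activation_bits is None when the name carries no activation width.
    """
    match = _MODE_RE.match(mode)
    if match is None:
        raise ValueError(f"unrecognized mode name: {mode!r}")
    act = match.group("act")
    return (
        match.group("backend"),
        int(act) if act is not None else None,
        int(match.group("wt")),
    )


ethosu_mode = parse_pt2e_mode("ethosu_16a8w")
weight_only_mode = parse_pt2e_mode("vulkan_8w")
```

Under this assumption, `ethosu_16a8w` reads as the Ethos-U backend with 16-bit activations and 8-bit weights, and a weight-only name such as `vulkan_8w` yields no activation width.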

Comment on lines +366 to +371

    compile_spec = EthosUCompileSpec(
        target,
        system_config,
        memory_mode,
        extra_flags=extra_flags,
    )
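
The call shape in the snippet above (positional target, system config, and memory mode, plus keyword `extra_flags`) can be mimicked with a stand-in to show how config values might flow into the compile spec. This is a sketch under stated assumptions: `StandInCompileSpec`, `EthosUOptions`, and `build_compile_spec` are hypothetical names, the real `EthosUCompileSpec` is not imported or reproduced, and every value used is a placeholder rather than a verified target or flag.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StandInCompileSpec:
    """Hypothetical stand-in that only mirrors the argument order seen in the diff."""
    target: str
    system_config: Optional[str] = None
    memory_mode: Optional[str] = None
    extra_flags: List[str] = field(default_factory=list)


@dataclass
class EthosUOptions:
    """Hypothetical config holder for the Ethos-U export settings."""
    target: str
    system_config: Optional[str] = None
    memory_mode: Optional[str] = None
    extra_flags: List[str] = field(default_factory=list)


def build_compile_spec(opts: EthosUOptions) -> StandInCompileSpec:
    """Forward every config field so none is silently dropped on the way."""
    return StandInCompileSpec(
        opts.target,
        opts.system_config,
        opts.memory_mode,
        # Copy the list so later edits to the config do not leak into the spec.
        extra_flags=list(opts.extra_flags),
    )


# All values below are placeholders, not real targets or compiler flags.
opts = EthosUOptions("target-placeholder", "system-config-placeholder",
                     "memory-mode-placeholder", ["--example-flag"])
spec = build_compile_spec(opts)
```

Copying `extra_flags` is a design choice in this sketch only; it keeps the spec independent of later mutations to the config object.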
Comment on lines 231 to 236

    "vulkan_8w",
    "tosa_8a8w",
    "ethosu_8a8w",
    "ethosu_16a8w",
    "vgf_8a8w",
    "vgf_16a8w",